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1 Abstract 

2 Individual Based Models (IBMs) and Agent Based Models (ABMs) have become widely used tools to 

3 understand complex biological systems. However, general methods of parameter inference for IBMs 

> 

4 are not available. In this paper we show that it is possible to address this problem with a traditional 
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5 likelihood-based approach, using an example of an IBM developed to describe the spread of Chytrid- 

6 iomycosis in a population of frogs as a case study. We show that if the IBM satisfies certain criteria we 

7 can find the likelihood (or posterior) analytically, and use standard computational techniques, such as 

q s MCMC, for parameter inference. 
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12 Individual or Agent Based Models (IBMs and ABMs, respectively) are widely used in ecology to under- 

o 

13 stand how interactions between freely acting agents, or between components of a complex system, result 

14 in larger scale patterns and behavior 1151 . Typically, the foundation of an IBM consists of some kind of 

X 

15 sub-model for "individuals", ranging from differential equations to heuristic rules. Multiple individuals 

03 

16 are then allowed to interact via other rules within the virtual space. Because they are fairly easy to 

17 formulate, are enormously flexible, and allow us to inherently incorporate individual variability, IBMs 
is and ABMs have become increasingly popular for biological and ecological applications lfT4l [201 . For 

19 instance, they have been used to describe swarming or herding behaviors of animals ifTTl l34l . biofilm 

20 growth and formation J22j|23l, and fish population dynamics (24l|27l, and white-nose syndrome in bats 

21 @. 
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22 Although IBMs are useful for understanding the qualitative behavior of systems, quantitative analy- 

23 sis is less common. Although there are various reasons for this [46], a major factor is that model output 

24 is typically complex, and thus difficult to analyze and compare with data. Frequently the approach has 

25 been to use experimental or observational data on individuals within a population to obtain rough pa- 

26 rameter estimates (such as birth or death rates) first, and then to use the IBM to explore the population 

27 level effects of various flavors of interactions given those parameters. However, in ecological settings, 

28 sometimes only system level observations are available, i.e., we cannot easily observe individuals un- 

29 coupled from each other or from the environment. Even if laboratory experiments on some parameters 

30 are available, these estimates may not translate to field conditions [45]. Thus, we would like to be able 

31 to use data from observational studies or other kinds of experiments that record individual level data in 

32 order to infer parameters in an IBM indirectly. 

33 The recently developed approach of Pattern-Oriented Modeling (POM) lfl6l l46l has been suggested 

34 as a way to standardize the analysis of IBMs, and facilitate indirect parameter inference. In practice, 

35 the procedure requires careful thinking about appropriate methods to compare model output and data. 

36 However, discussion of this, either inside or outside the POM framework, has been mostly ignored. 

37 Most methods for comparing model and data patterns for IBMs have been either qualitative or fairly 

38 ad-hoc, such as simple sensitivity analyses where parameters are perturbed over small ranges (e.g. [24]). 

39 Although this may be sufficient for some applications, for others, such as the one we explore here, a 

40 more quantitative approach is required. In particular, we want to be able to perform parameter inference. 

41 For other mechanistic models, such as state-space models |fTT|. methodologies for parameter infer- 

42 ence via likelihood-based statistics are well-developed [9[. However, these tools have not been utilized 

43 for IBMs. In this paper we show that it is possible to derive a likelihood for classes of IBMs that have cer- 

44 tain properties. We use a model of the dynamics of a fungal pathogen, Batrachochytrium dendrobatidis 

45 (Bd), on and between frogs as a motivating example. First we introduce our case study and the IBM for 

46 which we would like to infer parameters. Next we discuss the derivation of the the likelihood, and show 

47 that we are able to infer parameters from simulated data using Markov chain Monte Carlo (MCMC). Fi- 

48 nally we discuss under what circumstances practitioners should consider the approach: when it is likely 

49 to be possible to derive a likelihood in general, and how it can be practically extended for other models 
so and data. 
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5 i 2 Model 



52 Chytridiomycosis, a fungal disease of amphibians, has been implicated in population declines and ex- 

53 tinctions in amphibian populations worldwide JH|4T1. Efforts to understand the pathogenesis and spread 

54 of the disease have been wide ranging, including intensive observation of field populations C51I3T1I441 . 

55 laboratory experiments EH471, and mathematical and statistical modeling J4l|5l|30l|40l. However, many 

56 aspects of the disease, such as infection rates (both between and within frogs) or reproductive rates of 

57 the fungus, are very difficult to measure directly, especially in natural populations (e.g. Il37ll ). These 

58 factors are crucial for understanding why some populations are heavily affected by the disease 

59 while others are not (SHH, and for designing effective control strategies for the disease [38]. 



so 2.1 The dynamics of Chytridiomycosis in a population 

61 Bd is transmitted via an aquatic zoospore stage 0|36l, which infects keratinized cells on amphibian 

62 skin (or tadpole mouthparts; Il26l0 . A proportion of the water-borne zoospores that encounter a cell suc- 

63 cessfully infect the cell and develop into a sporangia. These sporangia then release zoospores back onto 

64 the surface of the skin Q, where they can either reinfect the same individual, or enter the zoospore pool 

65 (the water containing zoospores, in which frogs live). Thus, infection within an individual and trans- 

66 mission between individuals are linked through a single process. A series of IBMs (both deterministic 

67 and stochastic) describing this process were developed by [5]. Here we explore a version of this model 

68 for the spread of the disease within a single season, without stage (or age) structure or reproduction. 

69 The deterministic, continuous time dynamics (shown schematically in Figure[T] together with parameter 

70 definitions) are described by the following set of differential equations: 



dt 



77 Z + cSi for S{ < S max 

(1) 

and frog i dies for Si > s max 
1 

V 



all living frogs, i 



71 We are most interested in a discrete time stochastic version of this model [5]. To arrive at this 

72 model, time is first discretized into short steps of length At. Integrating Equations ([TJt-Q over the 

73 time increment t to t + At, holding the numbers of sporangia and zoospores fixed at the values at the 

74 beginning of the time step (Si(t) and Z(t)) approximates the mean of our stochastic process, which we 
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75 choose to model as Poisson, since the numbers of sporangia and zoospores are counts. Specifically, the 

76 number of sporangia Si(t + At) on individual i (if it is still alive) at time t + At, given the sporangia 

77 and zoospore levels at the previous time (Sj(t) and Z(t)), follows 

Si(t + At) ~ Pois(A;(t + At)), where (3) 
\i(t + At) = E[Si(t + At)] = S l {t)e cAt + (e cAt - l) (4) 

78 Individuals die between t and At if the number of sporangia on the frog exceeds the lethal threshold, 

79 so that Si(t) > s max , at which point all of the sporangia on the individual frog die as well (so that the 
so sporangia load is defined to be zero). 

si The zoospore level at time t + At depends on the zoospore level in the pool and the sum of sporangia 

82 across frogs, both at time t. The total zoospores in the pool at time t + At has a Poisson distribution 

Z(t + At) ~ Pois(£(t + AT)), where (5) 

£(t + At) = E[Z{t + At)] = Z(t)e- C3At + — (1 - e~ C3At ) (6) 

£3 

83 with C2 = St=i d^ity) an d C3 = fj, + The model output for our likelihood based analysis is the 

84 sporangia loads on frogs and zoospore levels in the pool at each time step, and the times at which frogs 

85 die. 

as 2.2 Likelihood-based approach 

87 Many mechanistic models in ecology are formulated as deterministic differential equation models, which 

88 are frequently fit using non-statistical techniques, such as non-linear least squares lTT9ll43l . The use of 

89 Approximate Bayesian Computation has also been proposed for deterministic models [42]. Mechanistic 

90 models that are inherently stochastic or incorporate stochastic elements have also become popular to 

91 describe both populations and individuals. These types of models have been fit using statistical, typically 

92 likelihood based, methods. Some examples of these methods for biological models include maximum 

93 likelihood via iterated filtering [21] and flavors of particle filtering ifTTl [TBI . However, thus far these 

94 methods have not been used for parameter inference for IBMs. 

95 In the classical or Bayesian statistical frameworks, the approach is to first find the analytic form of 

96 the likelihood, which is defined as the probability of observing the data under a parameterized model. 

97 Parameter inference, given data, then proceeds by finding maximum likelihood estimators (MLEs) or 
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98 by sampling from the posterior distribution under particular priors. Each of the two approaches has its 

99 benefits. We find the Bayesian posterior summaries of uncertainty to be more intuitive compared to the 

100 classical approach of reporting confidence intervals, and so this is the approach we present here. MLEs 

101 and confidence bands could also be obtained using alternative computational approaches. 

102 To infer the posterior probabilities of a vector of parameters 6 = py, fj,, f3, c, g, s ma x 

l, we need to 

103 compute (and evaluate) the likelihood Pr((S, A, -Z*)i:T ma J#) where (S, A, ^)i ; T max is the full data avail- 

104 able on ./V frogs over from time 1 to T max . For simplicity, in the following derivation we assume the 

105 data are observed without error, but with stochasticity inherent in the dynamics. Specifically, S is an 

106 N x T max matrix containing the sporangia counts on all individuals; A is an N x T max matrix that 

107 indicates whether individuals are alive (1) or dead (0); and Z is a vector of length T max which records 

108 the counts in the zoospore pool, at each time step. S and A could also be viewed as a collection of iV 

109 vectors with length < T max , one for each frog. For the calculations below, we set At = 1 day. The 
no derivation of the likelihood relies primarily on two conditions: the Markov property of the system and 

1 1 1 conditional independence of observations on individual frogs and the zoospore pool at a given time. By 

112 conditional probability, the likelihood can be re-written as: 

L = Pr((S,A,Z) 1:Tm j0) (7) 
= Pr((S, A, Z) Tin J (S, A, Z)i : T mM -i, 0) X Pr((S, A, Z)^^) 
= Pr((S, A, Z) Tin J (S, A, Z)i:T mM -i,0) 

x Pr((S, A,Z) Tmax _ 1 |(S,A,Z) 1:Tmax _ 2 ,0) x Pr((S, A, Z) liTmx - 2 \0) 

= [jPr((S,A,Z) t |(S,A,Z) w _ 1 ,0). (8) 

t=2 

113 In our case the new zoospore levels in the pool and sporangia loads on the individual frogs depend only 

114 on the levels and loads at the previous time step. In other words, the process exhibits a Markov property, 

115 such that Pr((S, A, Z) t |(S, A, Z) lit -i,0) = Pr((S, A, Z) t \(S, A, Z) t _ x , 9), and so the likelihood can 

116 be re-written as 

L= H Pr((S > A,Z) t |(S > A,Z) t _ 1 ,0) 

t=2 

117 If we look at the dynamic of the process over a single time step (Equations ([3]) - ([6]) ), we can see that, 
us given the information about the state of the system at time t — 1, the zoospore load in the pool at time t 

5 



119 is independent of the sporangia loads observed on frogs at time t. That is 

L= J] Pr((S,A) t |(S,A,Z) t _i,0)xPr((Z) t |(S,A,Z) t _i,e). 

t=2 

120 Moreover, the observed levels on each frog are conditionally independent of each other. In other words, 

121 given the state of the system and parameters at time t — 1, (S, A)t and Zt are conditionally independent, 

122 as are any pairs (S, A)i )t and (S, A)jj where i ^ j. Furthermore, the state of individual i at time t only 

123 depends on the parameters, its own state, and the reservoir at the previous time step (i.e. (S, A)a is 

124 independent of (S, A)j t-i for i ^ j) so we can write the likelihood as 

T max / N \ 

L = II [U PT ((^ A k^ A k^ Z ^ d )) xPr((Z) t |(S,A,Z) t _i,fl). (9) 

t=2 \i=l J 

125 The second term in the above product is the Poisson distribution described in Equation ([5]) with mean 

126 given by Equation ([6]). Then the likelihood can be written as 

L = ft (Ums,AU(S t A)^ u 2^ lt e)\ XIJ*|P. (10) 

127 We can reverse the ordering of the products over i and t in the first term of the above expression, and 

128 write things more compactly as 

^nfff^)^n t( ^'y (t)) . 

i=l \ t=2 / t=2 W ' 

129 where 

V itt = Pr((5, A) iit |(5, A)<, t _i, Z t _i, 0) (12) 

130 is the probability of observing the data on the i th frog at time t conditional on the data at the previous 

131 time step and the parameters. By conditional probability, we can write this as 

V iit = PT(S i ,t\A i>ti (S i A)i,t-i,Z t - 1 ,e) x Pr(4,t|(S,A) M _i,Z t _i,0). (13) 



132 The probability that the individual is in state Ai tt = {0, 1} is distributed as a Bernoulli random variable, 

133 conditional upon the parameters, and all the data at time t—1, 



PT(A iit \(s,A) i>t - 1 ,z t - 1 ,e)= P ?;> t (i- Pi;t y- A ^, (i4) 



J i,t 

134 where the probability of a success, p^t, defined as the frog being alive at time t {Ai t = 1). i s tne 

135 probability that the number of sporangia on the frog will less than s max , i.e., 
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Pi,t 



AW*)) 



(15) 



136 where F x (q) denotes the cumulative distribution function up to x for the Poisson distribution with pa- 

137 rameter q. Conditional on Aij, then, the probability of observing the sporangia load on the frog is given 

138 by 

f 

Pois(Ai(t)) if Ai.t = 1, 



Pr(5 iit |A i)t , (S, A) i)t -i, Z t -i, 



(16) 



S, 



(S it t,0) 



if Ai.t = 0. 



139 where 5( S . t>0 ) is the Kronecker delta function. Note also that, for completeness, we can specify that 

140 Pr(A» t = 0| Aij-i = 0) = 1, although the data we observe ensure this. Let Ti be the last time that the 



141 frog is observed alive. Using this together with Equations 14 and 16 we can then re- write the expression 

142 for T>i t as 



'Xi(t) s i,t exp(-Aj(t)) 



1 - Pi,t 
1 



x p ijt if t < Ti 

\it = Ti + \ 
otherwise 



(17) 



143 Incorporating the expression for T>ij into Equation[TTJ we arrive at an expression for the full likelihood: 



pr((s, A,z) 1:Tm jd) = n n v ^ x n 



^(i) Z(i) e X p(-e(t)) 



i=\ \ t=2 
N /Ti 



t=2 



Z(t)> 



n n 

1 \t 

n 



1=1 \t=2 
T, 

X 

t=2 



\i(t) s '>t exp(-Ai(t)) 



5 M ! 

e(t) Zt exp(-C(t)) 



X p i)t X (1 - Pi,Ti+l) 



(18) 



144 2.3 Simulation Experiment 

145 Once the likelihood has been derived, various approaches could be used for parameter inference. As an 

146 example of how this can proceed, we seek to infer parameters using a single stochastic realization of 

147 the model described in Section [2TT| as data. The simulation consists of 30 individuals in a single pool of 
us unit volume. Initially, the pool is free of zoospores, and all individuals, save one, are uninfected. The 
149 infected individual that acts as the source of the infection has a load of 100 sporangia. Parameter values 
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150 used in the simulation are given in Table [T] This data realization is shown in Figure [2] 

151 We attempt to infer parameter values for three cases: data without noise, data with normal noise, and 

152 data with lognormal noise. In the first case we assume that we can monitor both the level of the infection 

153 on individual frogs as well as zoospore levels within the pool (or reservoir) without any observation 

154 error (i.e. all stochasticity is due to process error). In the second and third cases we add a small amount 

155 of observational noise to the data to see if the inferential methods are robust. In our second case, we 

156 consider additive noise so that the observed values, Si, Z are given by = Si + ei and Z = Z + £i 

157 where ei,£i ~ N(0, 10). In the third case we assume multiplicative noise, so that Si = Sie €2 and 

158 Z = Ze& where €2, £2 ~ N(0, 0.01). Throughout, we assume that we do not have false positives, i.e., 

159 we correctly identify uninfected individuals as having zero fungal load (Si = 0). 

160 The likelihood is a sufficiently complicated function of the data and parameters that analytical meth- 

161 ods are not feasible. Thus we turn to numerical methods, specifically Markov chain Monte Carlo 

162 (MCMC) methods in the Bayesian context. We specify weak priors on all parameters (see Table [T]) 

163 and sample from the posterior distribution of the parameters using a variation of the random walk 

164 Metropolis-Hastings (MH) algorithm. A short description of the algorithm used and its implementa- 

165 tion can be found Appendix [A] See [9] for further details of these sampling approaches in the context 

166 of ecological problems. For all cases, samples of 20000 draws from the posterior distribution were 

167 collected. 

168 3 Results 

169 We begin with inference for the system without observational error. Figure|3]shows the samples from the 

1 70 joint posterior distributions plotted for pairs of parameters in this case. The marginal posterior (and prior) 

171 distributions of the parameters are shown in Figure [4] As is clear in the figure, we were able to get very 

172 good posterior distributions for all of the parameters, in that the posterior 95% credible interval includes 

173 the "real" parameter values used to generate the data and the posterior has low variance. Additionally, 

174 we can see that the prior distributions for the parameters in the regions of high posterior probability were 

175 very uninformative (i.e. the priors are low and flat in these regions). 

176 Next we examine the two cases where additional observation error is added to the data, but the model 

177 is not extended to include an observational model. Marginal posterior distributions of the parameters 

178 for both of these cases are shown in Figure [5] Notice that although the posterior distributions are quite 
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179 different from the prior (i.e., the data are informative for the parameters) only some are infered with 

180 similar accuracy to the case without noise. In particular, s max is reasonably accurately inferred in both 

181 cases. However, as there is additional variability in the noisy data compared to what would be typical 

182 in output from the model with "correct" parameters, some parameter estimates are biased. Which pa- 

183 rameter estimates are biased depends upon the type of noise. For the case with additive, normal error, 

184 estimates for c and f3 (mostly effecting the dynamics of sporangia on frogs) are biased, whereas for the 

185 multiplicative log-normal error, g, 7, and fj, (mostly effecting the dynamics of the zoospore pool) are 

186 biased. 

187 This pattern of bias is related to how the error enters into the model, which effects the relative size of 

188 the noise compared to the signal for sporangia and zoospore counts. The zoospore levels in the pool are 

189 typically more than two orders of magnitude higher than the sporangia levels on an individual frog (see 

190 Figure[2]). Thus for the additive noise case, the relative errors in the sporangia counts is larger than for the 

191 zoospore levels. Since the observational noise is smaller than the process error the posterior estimates 

192 for the parameters determining the zoospore dynamics are largely unchanged. However, for the log- 

193 normal noise, this is not true. Instead, as the value of the counts increases, the absolute error increases 

194 as well, similarly to how the process stochasticity increases with the mean. For the zoospore load in the 

195 pool, the noise is able to skew the parameter estimates that are primarily informed by this time series. 

196 Although we might expect similar results for the sporangia on frogs, since there are many trajectories 

197 being observed simultaneously, the estimates of parameters related to sporangia dynamics are less likely 

198 to be biased in any particular direction. Observe, that the overall magnitude of the multiplicative noise 

199 is larger than in the additive case, and thus the accuracy of the estimates for all parameters is poorer. 

200 4 Discussion 

201 Mathematical models in ecology and biology play many roles, from providing short term prediction, to 

202 acting as tools for furthering our general understanding of biological mechanisms. Frequently, as has 

203 been the case with most IBMs, mechanistic models have been constrained to be used for qualitative 

204 understanding as methods for fitting them to data have not always been available. However, as we have 

205 shown here, it may be possible to describe an IBM using a likelihood based approach and perform 

206 indirect parameter inference from data for quantitative predictions. 

207 Finding an appropriate likelihood for an IBM requires an additional investment above and beyond 
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208 the current practice of building a simulation model. Thus, the choice to take this approach necessarily 

209 depends upon the goal of the modeling exercise. If the primary objective is to understand the range 

210 of system behaviors for different types of individual actions and interactions then this approach may be 

21 1 overkill. However, if there is a real desire to be able to perform parameter inference from data, especially 

212 in cases where knowledge of the uncertainty in the estimates is desired, the investment is likely to yield 

213 more satisfying results than other approaches currently being used. 

214 In the system explored here, there are a number of reasons why we particularly want to be able 

215 to estimate parameters, instead of just understanding the behavior of the model or making qualitative 

216 predictions. For instance, we want to better understand the biological differences between strains of the 

21 7 fungus, between species of frogs affected by the fungus, the effects of temperature variation on the spread 

218 of the epidemic, etc. These differences manifest themselves as variation in parameter values, which are 

219 now estimable. Knowing both parameter estimates, and the uncertainty in these estimates, allows the 

220 design of effective intervention strategies lfl2l|29l . The approach shown here has the additional feature 

221 that by explicitly taking into account interactions between individuals we can utilize all available data 

222 for parameter inference while reducing concerns of pseudo-replication. 

223 The general assumptions that allow us to find the likelihood for the IBM presented here would be 

224 applicable to many, though not all, other IBMs. Certain assumptions in many IBMs could make this 

225 approach more difficult. For instance, models where individuals exhibit behavioral "rules of thumb" or 

226 models that include spatially explicit stochastic movement and interactions between individuals, may 

227 be considerably more difficult to formulate in this framework. However, even in these cases it may be 

228 possible to find an appropriate likelihood model for the data (for instance see Patterson et al. §35\ ). or to 

229 consider simpler or non-heuristic version of the model that is amenable to a likelihood based analysis. 

230 For the IBM described in this paper the most important simplifying assumptions are: 1) that transitions 

231 are described by parametric distributions; 2) that individuals are conditionally independent at the current 

232 time step, so that their current state only depends on the states of other individuals at previous times; 

233 3) the system exhibits a Markov property (the state at time t only depends on the state at time t — 1). 

234 In fact, when the Markov property holds, this type of IBM could be viewed as a type of statistical 

235 state-space model (SSM) ©El [33, and thus one could use many of the computational tools that have 

236 already been developed by the statistical community for analysis of SSMs [? [33). Because of this, we 

237 envision straightforward parametric inference when various extensions are added to the model presented 
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238 here. For instance, observational uncertainty, hidden states, hierarchical structures (all three of which 

239 are standard in SSMs), and more biological complexities, are easily incorporated. 

240 Of the extensions mentioned above, the addition of an observation model is likely to be the most 

241 important, since, as we showed, ignoring this source of stochasticity can lead to bias in parameter es- 

242 timates. The addition of an observation model is conceptually simple when the model is treated as an 

243 SSM with an unobserved state process. However, when portions of the state are hidden, either because 

244 of the inclusion of an observation model, or because there is no data gathered for a portion of the pro- 

245 cess, inference may be more difficult. This may be because there is not enough data to infer the hidden 

246 state, or because the computational routines necessary for inference are more involved. Additionally, the 

247 tools best suited for inference in these cases are frequently less well known to ecology community. For 

248 instance, MCMC and Sequential Importance Sampling (SIS), both standard methods in statistics, have 

249 been used for Bayesian inference for SSMs [33], and these approaches are slowly being introduced in 

250 the ecological literature. 

251 Although some types of extensions would typically require different computational techniques to 

252 perform the analysis, not all extensions have this limitation. Since IBMs are typically designed to 

253 explore how small differences between individuals propagate through the system, the addition of hi- 

254 erarchical modeling iflOl l28ll . for instance to explicitly include variation in parameter values between 

255 populations or individuals, would be a natural extension that is still tractable with simple methods. Ad- 

256 ditionally, various other biological details, such as age or state structure, could be incorporated into 

257 the modeling framework fairly easily, as long as the Markov property, etc., are maintained. Thus, al- 

258 though the computational burden of likelihood-based inference for this approach maybe more involved 

259 than simply performing forward simulations of the IBM, this cost can be more than offset by the ability 

260 perform robust quantitative analysis and parameter inference. 
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264 A Sampling from a Distribution via the Metropolis Hastings or Metropolis- 

265 within-Gibbs algorithms 

266 The Metropolis-Hastings (MH) algorithm is a type of Markov chain Monte Carlo (MCMC) simulation 

267 commonly used to sample from distributions, particularly posterior distributions in Bayesian statistics, 

268 when other sampling methods are not available. Details on MCMC and the various sampling methods 

269 discussed below can be found in Clark [9J. Here we give a very brief outline of the sampling methods 

270 utilized in this paper. In the most basic sense, the MH algorithm begins at some parameter setting, 

271 proposes new values with a user-specified probability related to the location of the parameters relative 

272 to the original sample, evaluates the posterior probability density of the new sample, then "accepts" 

273 the new sample probabilistically, where this probability is proportional to the posterior of the last and 

274 proposed samples and the probability of the proposed samples. In particular, the acceptance ratio for the 

275 MH algorithm is defined to be , v , , , 

p{9 l )/j(9 l \e*) 

276 where 9 l is the last accepted draw of the parameters, 9* is the proposed parameter set, p(-) denotes the 

277 distribution we are trying to sample from, in our case the posterior distribution, and J(9 2 \9 1 ) the jump 

278 (or proposal) distribution, which gives the probability of proposing parameters 9 2 given that the current 

279 parameters are 9 l . More specifically, the algorithm is as follows: 

280 • Choose an initial setting of the parameter vector 9^ 

281 • For t in to T max 

282 1. Set#( T+1 ) = #M 

283 2. propose a new value of the parameters 9* 

284 3. evaluate the MH acceptance ratio A(9 l \9*) (defined above) 

285 4. choose u ~ Unif(0, 1) 

286 5. accept the proposed parameter if u < minj^, 1}, i.e., set #( T+1 ) = 9* (otherwise 6>( r+1 ) 

287 remains fixed at 9^) 

288 The Metropolis-within-Gibbs algorithm is a variant of the MH algorithm which allows one to update 

289 parameters individually, instead of all at once. The algorithm proceeds as follows: 
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290 • Choose an initial setting of the parameter vector 

291 • For r in to T max 

292 - Set #( T+1 ) = 0< T ) 

293 - For i in 1 to N: 

294 1. propose a new value of the i th parameter 9*, so that 9* = 9-i U 9*. 

295 2. evaluate the MH acceptance ratio A(9 l \9*) (defined above) 

296 3. choose u ~ Unif(0, 1) 

297 4. accept the proposed parameter if u < min{^4, 1}, i.e., set 9 ( f +1 ^ = 9* 

298 Additionally, instead of proposing parameters only one at a time, parameters can be chosen in blocks of 

299 multiple parameters at a time. This can be especially useful when parameters are correlated and a joint 

300 jump distribution is used, as this can improve mixing in the chain. 

301 For our model, we choose to use the MwG algorithm. The likelihood used was that derived in Section 



302 ??. Proposal distributions for the parameters were normal, centered at the previously accepted parameter 

303 value (or multi-variate normal, for jointly proposed parameters), with variances (and covariances) that 

304 were tuned in preliminary runs so that an appropriate balance of acceptance and mixing was achieved. 

305 Figure[3]shows the samples from the joint posterior distributions plotted for pairs of parameters obtained 

306 using this method. Although most pairs of parameters are fairly independent, the exceptions are the 

307 strong correlations between c and (3 (or in the plot In (3) and between g and 7 (or log 10 7). Thus we 

308 chose to jointly propose g with log 10 (7)), which significantly improved mixing. We observed that even 

309 when the Markov chain was initialized away from the true values the sampler moved fairly quickly into 

310 the appropriate portion of parameter space. For the two cases with error, a burn-in of approximately 

311 1000 samples seemed to be sufficient. 
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parameter 


value 


prior 


posterior 95% C.I. 


-logio(7) 


3 


Gamma(3, 3) 


(2.90-3.11) 




0.25 


Gamma(3, 2) 


(0.248 - 0.252) 


-ln(/3) 


9.21 


Gamma(3, 4) 


(9.12-9.36) 


c 


0.15 


N(0,5) 


(0.148-0.153) 


9 


3.5 


Gamma(3, 4) 


(3.44-3.56) 


•Smax 


10000 


N(10000,2000) 


(9890-10120) 



Table 1: Parameter values used for the "data" simulation, and corresponding prior distributions and 95% central posterior 
credible intervals estimated from data without observation error. 




Figure 1 : Schematic of deterministic model of the transmission of Bd in a frog population with a description of parameters 
included in the model. 
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Figure 2: One realization of the simulation model shown in Figure [I] Shown are sporangia loads on frogs all frogs (N=30, 
solid black lines), and the zoospore level in the pool (red dotted line) over the course of the virtual experiment. Each solid line 
represents the sporangia load on an individual frog. When the sporangia load reaches a maximum (here 10000), the frog dies, 
and the sporangia load on that frog immediately drops to zero. 
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Figure 4: Histograms of marginal posterior densities of the model parameters estimated from data without observation error. 
The "true" value of the parameter is indicated with a red diamond. Prior distributions are shown as dotted red lines. 
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Figure 5: Samples from the posterior distributions of parameters when data observed sampled with a small amount of (top two 
rows) normally distributed noise (N(0,10)) and (bottom two rows) Log-Normally distributed noise (logN(0,0.01)) The "true" 
value of the parameter is indicated with a red diamond. Prior distributions are shown as dotted red lines. See Section [23| for 
details. 
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